AITopics | ai red team

Collaborating Authors

ai red team

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

From Firewalls to Frontiers: AI Red-Teaming is a Domain-Specific Evolution of Cyber Red-Teaming

Sinha, Anusha, Grimes, Keltin, Lucassen, James, Feffer, Michael, VanHoudnos, Nathan, Wu, Zhiwei Steven, Heidari, Hoda

arXiv.org Artificial IntelligenceSep-16-2025

A red team simulates adversary attacks to help defenders find effective strategies to defend their systems in a real-world operational setting. As more enterprise systems adopt AI, red-teaming will need to evolve to address the unique vulnerabilities and risks posed by AI systems. We take the position that AI systems can be more effectively red-teamed if AI red-teaming is recognized as a domain-specific evolution of cyber red-teaming. Specifically, we argue that existing Cyber Red Teams who adopt this framing will be able to better evaluate systems with AI components by recognizing that AI poses new risks, has new failure modes to exploit, and often contains unpatchable bugs that re-prioritize disclosure and mitigation strategies. Similarly, adopting a cybersecurity framing will allow existing AI Red Teams to leverage a well-tested structure to emulate realistic adversaries, promote mutual accountability with formal rules of engagement, and provide a pattern to mature the tooling necessary for repeatable, scalable engagements. In these ways, the merging of AI and Cyber Red Teams will create a robust security ecosystem and best position the community to adapt to the rapidly changing threat landscape.

ai system, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.11398

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Lessons From Red Teaming 100 Generative AI Products

Bullwinkel, Blake, Minnich, Amanda, Chawla, Shiven, Lopez, Gary, Pouliot, Martin, Maxwell, Whitney, de Gruyter, Joris, Pratt, Katherine, Qi, Saphir, Chikanov, Nina, Lutz, Roman, Dheekonda, Raja Sekhar Rao, Jagdagdorj, Bolor-Erdene, Kim, Eugenia, Song, Justin, Hines, Keegan, Jones, Daniel, Severi, Giorgio, Lundeen, Richard, Vaughan, Sam, Westerhoff, Victoria, Bryan, Pete, Kumar, Ram Shankar Siva, Zunger, Yonatan, Kawaguchi, Chang, Russinovich, Mark

arXiv.org Artificial IntelligenceJan-13-2025

In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems. Due to the nascency of the field, there are many open questions about how red teaming operations should be conducted. Based on our experience red teaming over 100 generative AI products at Microsoft, we present our internal threat model ontology and eight main lessons we have learned: 1. Understand what the system can do and where it is applied 2. You don't have to compute gradients to break an AI system 3. AI red teaming is not safety benchmarking 4. Automation can help cover more of the risk landscape 5. The human element of AI red teaming is crucial 6. Responsible AI harms are pervasive but difficult to measure 7. LLMs amplify existing security risks and introduce new ones 8. The work of securing AI systems will never be complete By sharing these insights alongside case studies from our operations, we offer practical recommendations aimed at aligning red teaming efforts with real world risks. We also highlight aspects of AI red teaming that we believe are often misunderstood and discuss open questions for the field to consider.

language model, opération, vulnerability, (17 more...)

arXiv.org Artificial Intelligence

2501.07238

Country:

North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Indonesia > Bali (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)

Genre:

Overview (0.68)
Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.69)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)

Add feedback

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

Haider, Emman, Perez-Becker, Daniel, Portet, Thomas, Madan, Piyush, Garg, Amit, Ashfaq, Atabak, Majercak, David, Wen, Wen, Kim, Dongwoo, Yang, Ziyi, Zhang, Jianwen, Sharma, Hiteshi, Bullwinkel, Blake, Pouliot, Martin, Minnich, Amanda, Chawla, Shiven, Herrera, Solianna, Warreth, Shahed, Engler, Maggie, Lopez, Gary, Chikanov, Nina, Dheekonda, Raja Sekhar Rao, Jagdagdorj, Bolor-Erdene, Lutz, Roman, Lundeen, Richard, Westerhoff, Tori, Bryan, Pete, Seifert, Christian, Kumar, Ram Shankar Siva, Berkley, Andrew, Kessler, Alex

arXiv.org Artificial IntelligenceAug-22-2024

Recent innovations in language model training have demonstrated that it is possible to create highly performant models that are small enough to run on a smartphone. As these models are deployed in an increasing number of domains, it is critical to ensure that they are aligned with human preferences and safety considerations. In this report, we present our methodology for safety aligning the Phi-3 series of language models. We utilized a "break-fix" cycle, performing multiple rounds of dataset curation, safety post-training, benchmarking, red teaming, and vulnerability identification to cover a variety of harm areas in both single and multi-turn scenarios. Our results indicate that this approach iteratively improved the performance of the Phi-3 models across a wide range of responsible AI benchmarks. Finally, we include additional red teaming strategies and evaluations that were used to test the safety behavior of Phi-3.5-mini and Phi-3.5-MoE, which were optimized for multilingual capabilities.

benchmark, language model, phi-3, (15 more...)

arXiv.org Artificial Intelligence

2407.13833

Country: South America > Brazil (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (0.93)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

Add feedback

Microsoft's AI Red Team Has Already Made the Case for Itself

WIREDAug-7-2023, 17:51:37 GMT

For most people, the idea of using artificial intelligence tools in daily life--or even just messing around with them--has only become mainstream in recent months, with new releases of generative AI tools from a slew of big tech companies and startups, like OpenAI's ChatGPT and Google's Bard. But behind the scenes, the technology has been proliferating for years, along with questions about how best to evaluate and secure these new AI systems. On Monday, Microsoft is revealing details about the team within the company that since 2018 has been tasked with figuring out how to attack AI platforms to reveal their weaknesses. In the five years since its formation, Microsoft's AI red team has grown from what was essentially an experiment into a full interdisciplinary team of machine learning experts, cybersecurity researchers, and even social engineers. The group works to communicate its findings within Microsoft and across the tech industry using the traditional parlance of digital security, so the ideas will be accessible rather than requiring specialized AI knowledge that many people and organizations don't yet have.

ai red team, microsoft, red team, (2 more...)

WIRED

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.57)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)

Add feedback

Vulnerabilities May Slow Air Force's Adoption of Artificial Intelligence

#artificialintelligenceSep-24-2021, 14:50:11 GMT

The Air Force needs to better prepare to defend AI programs and algorithms from adversaries that may seek to corrupt training data, the service's deputy chief of staff for intelligence, surveillance, reconnaissance and cyber effects said Wednesday. "There's an assumption that once we develop the AI, we have the algorithm, we have the training data, it's giving us whatever it is we want it to do, that there's no risk. There's no threat," said Lt. Gen. Mary F. O'Brien, the Air Force's deputy chief of staff for intelligence, surveillance, reconnaissance and cyber effects operations. That assumption could be costly to future operations. Speaking at the Air Force Association's Air, Space and Cyber conference, O'Brien said that while deployed AI is still in its infancy, the Air Force should prepare for the possibility of adversaries using the service's own tools against the United States.

adversary, air force, artificial intelligence, (14 more...)

#artificialintelligence

Country:

North America > United States > Idaho > Ada County > Boise (0.06)
Asia > China (0.06)

Industry: Government > Military > Air Force (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback